Features of Gene Extraction by Nonlinear Support Vector Machines in Gene Expression Analysis

Authors

  • Daisuke Komura
  • Hiroshi Nakamura
  • Shuichi Tsutsumi
  • Hiroyuki Aburatani
  • Sigeo Ihara
Abstract

Statistical analysis of gene expression data from DNA microarrays has made it possible to extract information from tissue and cell samples. When comparing two classes of gene expression datasets (e.g. datasets from normal and cancerous tissues), we first choose discriminative genes, which have significantly different expression values between the two classes and characterize each class. In most cases, the genes chosen by statistical filter methods such as the t-test tend to be strongly expressed in one class and weakly expressed in the other. However, the genes ranked lower by these simple filter methods may also include discriminative and informative genes. The expression values of such genes have complicated distributions. Suppose there are two classes whose distributions are as depicted in Figure 1: one class has two peaks in the frequency of gene expression values, while the peak of the other class is located between them. In this case, although such a gene is important, it is ranked low and thus discarded by conventional filter methods because of its low within-class average and its high deviation. In supervised classification problems, various wrapper methods such as Recursive Feature Elimination (RFE) [1] have been proposed for feature selection. Since the main purpose of a wrapper method is to improve the performance of the classification algorithm, the genes chosen by the method have received little attention, especially in nonlinear classification. In this paper, we use a wrapper method based on a nonlinear classification algorithm to extract discriminative genes that are difficult to extract with conventional filter methods. The RFE method based on nonlinear Support Vector Machines (SVMs) [2] is employed to this end because it has been successfully applied to the classification of gene expression data.
We investigate the genes extracted by the RFE method based on SVMs with a Gaussian kernel function and show that it can extract discriminative genes that are not chosen by conventional filter methods.
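The scenario in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the synthetic data, the helper names (`make_toy_expression`, `cv_accuracy`, `backward_elimination`), and all parameter values are assumptions made for illustration. In place of the RFE ranking criterion of [1], it uses a simpler wrapper, backward elimination driven by the cross-validated accuracy of a Gaussian-kernel SVM from scikit-learn. Gene 0 has two peaks in one class and a single peak between them in the other, so its t-statistic is small, yet the nonlinear wrapper keeps it to the end.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC


def make_toy_expression(n_per_class=60, n_noise=4, seed=0):
    """Toy data (hypothetical): gene 0 is bimodal in class +1, unimodal in class -1."""
    rng = np.random.default_rng(seed)
    half = n_per_class // 2
    peaks = np.concatenate([np.full(half, -3.0), np.full(n_per_class - half, 3.0)])
    gene0_pos = peaks + rng.normal(0.0, 0.5, n_per_class)  # two peaks
    gene0_neg = rng.normal(0.0, 0.5, n_per_class)          # one peak in between
    gene0 = np.concatenate([gene0_pos, gene0_neg])
    noise = rng.normal(0.0, 1.0, (2 * n_per_class, n_noise))  # uninformative genes
    X = np.column_stack([gene0, noise])
    y = np.concatenate([np.ones(n_per_class, dtype=int),
                        -np.ones(n_per_class, dtype=int)])
    return X, y


def cv_accuracy(X, y):
    """Mean 5-fold accuracy of an SVM with a Gaussian (RBF) kernel."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    return cross_val_score(SVC(kernel="rbf", gamma="scale"), X, y, cv=cv).mean()


def backward_elimination(X, y):
    """Return gene indices in elimination order; the last entry survives longest."""
    remaining = list(range(X.shape[1]))
    eliminated = []
    while len(remaining) > 1:
        # Score each candidate by the accuracy of the model trained without it.
        scores = [
            cv_accuracy(X[:, [g for g in remaining if g != cand]], y)
            for cand in remaining
        ]
        # Drop the gene whose removal hurts accuracy the least.
        drop = remaining[int(np.argmax(scores))]
        remaining.remove(drop)
        eliminated.append(drop)
    return eliminated + remaining


X, y = make_toy_expression()
# Filter view: Welch t-test on gene 0 between the two classes.
t_stat = ttest_ind(X[y == 1, 0], X[y == -1, 0], equal_var=False).statistic
# Wrapper view: elimination order under the nonlinear classifier.
order = backward_elimination(X, y)
print(f"|t| for gene 0: {abs(t_stat):.2f}")
print("elimination order (last survives):", order)
```

Because the two class means of gene 0 nearly coincide, a mean-based filter scores it like a noise gene, while removing it from the nonlinear classifier drops accuracy to about chance level, so the wrapper retains it.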


Similar resources

Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine

DNA microarray gene expression gives access to a wealth of information spanning thousands of variables (genes). Analysis of this information can reveal the genetic causes of disease and of differences between tumors. In this study we try to reduce the high-dimensional data with a statistical method to select valuable, high-impact genes as biomarkers, and then classify ovarian tumors based on gene expression data of...


Application of Gene Expression Programming and Support Vector Regression models to Modeling and Prediction Monthly precipitation

Estimating and predicting precipitation and the resulting runoff play an important role in the sound management and exploitation of basins, the management of dams and reservoirs, the minimization of flood and drought damage, and water resource management, so they are of interest to hydrologists. The good performance of intelligent models leads researchers to use them for predicting hydrological ph...


Multivariate Feature Extraction for Prediction of Future Gene Expression Profile

Introduction: The features of a cell can be extracted from its gene expression profile. If the gene expression profiles of future descendant cells are predicted, the features of the future cells are also predicted. The objective of this study was to design an artificial neural network to predict gene expression profiles of descendant cells that will be generated by division/differentiation of h...


Expression of Brucella abortus Omp25 Protein in Lactococcus lactis Probiotic Bacteria

Background and purpose: The sequence of Omp25 is conserved in all Brucella species. The high antigenicity of the product of this gene stimulates the host’s immune system. Using engineered probiotic bacteria is an appropriate method for vaccine transport. The aim of this study was to express the Omp25 of the Brucella abortus pathogenic bacterium in Lactococcus lactis probiotic bacterium. Materi...


Cloning rhoptry protein 1 (ROP1) gene of Toxoplasma gondii (RH) in expression vector

Toxoplasma gondii contains various immunogenic antigens. The most important Toxoplasma antigens are somatic and excreted/secreted antigens. Rhoptry proteins are known as excreted/secreted antigens. These antigens have been proposed as vaccine candidates against toxoplasmosis. The main objective of the present work was cloning the rhoptry protein 1 (ROP1) gene of Toxoplasma gondii (RH) in a cloning...


Face Recognition using Eigenfaces, PCA and Support Vector Machines

This paper is based on a combination of principal component analysis (PCA), eigenfaces and support vector machines. Using the N-fold method and with respect to the value of N, each person's face images are divided into two sections. As a result, vectors of training features and test features are obtained. Classification precision and accuracy were examined with three different types of kernel and...




Publication date: 2003